Routing problems and Markovian decision processes
Similar resources
Non-Deterministic Policies in Markovian Decision Processes
Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct decision support systems for action selection in Markovian environments. Although conventional meth...
Non-Markovian Policies in Sequential Decision Problems
In this article we prove the validity of the Bellman Optimality Equation and related results for sequential decision problems with a general recursive structure. The characteristic feature of our approach is that non-Markovian policies are also taken into account. The theory is motivated by some experiments with a learning robot.
Constrained Markovian decision processes: the dynamic programming approach
We consider semicontinuous controlled Markov models in discrete time with total expected losses. Only control strategies which meet a set of given constraint inequalities are admissible. One has to build an optimal admissible strategy. The main result consists in the constructive development of optimal strategy with the help of the dynamic programming method. The model studied covers the case o...
Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes
Reinforcement learning (RL) has become a central paradigm for solving learning-control problems in robotics and artificial intelligence. RL researchers have focussed almost exclusively on problems where the controller has to maximize the discounted sum of payoffs. However, as emphasized by Schwartz (1993), in many problems, e.g., those for which the optimal behavior is a limit cycle, it is mo...
Journal
Journal title: Journal of Mathematical Analysis and Applications
Year: 1985
ISSN: 0022-247X
DOI: 10.1016/0022-247x(85)90097-6